Written by: Lakshya Mathur
Excel-based malware has been around for decades and has been in the limelight in recent years. During the second half of 2020, we saw adversaries using Excel 4.0 macros, an old technology, to deliver payloads to their victims. They were mainly using workbook streams via the XLSX file format. In these streams, adversaries were able to enter code straight into cells (that’s why they were called macro-formulas). Excel 4.0 also used API level functions like downloading a file, creation of files, invocation of other processes like PowerShell, cmd, etc.
With the evolution of technology, AV vendors started to detect these malicious Excel documents effectively and so to have more obfuscation and evasion routines attackers began to shift to the XLSM file format. In the first half of 2021, we have seen a surge of XLSM malware delivering different family payloads (as shown in below infection chart). In XLSM adversaries make use of Macrosheets to enter their malicious code directly into the cell formulas. XLSM structure is the same as XLSX, but XLSM files support VBA macros which are more advanced technology of Excel 4.0 macros. Using these macrosheets, attackers were able to access powerful windows functionalities and since this technique is new and highly obfuscated it can evade many AV detections.
Excel 4.0 and XLSM are both known to download other malware payloads like ZLoader, Trickbot, Qakbot, Ursnif, IcedID, etc.
The above figure shows the Number of samples weekly detected by the detected name “Downloader-FCEI” which specifically targets XLSM macrosheet based malware.
Detailed Technical Analysis
XLSM Structure
XLSM files are spreadsheet files that support macros. A macro is a set of instructions that performs a record of steps repeatedly. XLSM files are based upon Open XLM formats that were introduced in Microsoft Office 2007. These file types are like XLSX but in addition, they support macros.
Talking about the XLSM structure when we unzip the file, we see four basic contents of the file, these are shown below.
- _rels contains the starting package-level relationship.
- docProps contains the metadata of the excel file.
- xl folder contains the actual contents of the file.
- [Content_Types].xml has references to the XML files present within the above folders.
We will focus more on the “xl” folder contents. This folder contains all the excel file main contents like all the worksheets, media files, styles.xml file, sharedStrings.xml file, workbook.xml file, etc. All these files and folders have data related to different aspects of the excel file. But for XLSM files we will focus on one unique folder called macrosheets.
These XLSM files contain macrosheets as shown in figure-2 which are nothing but XML sheet files that can support macros. These sheets are not available in other Excel file formats. In the past few months, we have seen a huge surge in XLSM file-type malware in which attackers store malicious strings hidden within these macrosheets. We will see more details about such malware in this blog.
To explain further how attackers uses XLSM files we have taken a Qakbot sample with SHA 91a1ba70132139c99efd73ca21c4721927a213bcd529c87e908a9fdd71570f1e.
Infection Chain
The infection chain for both Excel 4.0 Qakbot and XLSM Qakbot is similar. They both downloads dll and execute it using rundll32.exe with DllResgisterServer as the export function.
XLSM Threat Analysis
On opening the XLSM file there is an image that prompts the user to enable the content. To look legitimate and clean malicious actors use a very official-looking template as shown below.
On digging deeper, we see its internal workbook.xml file.
Now as we can see in the workbook.xml file (Figure-5), there is a total of 6 sheets and their state is hidden. Also, two cells have a predefined name and one of them is Sheet2323!$A$1 defined as “_xlnm.Auto_Open” which is similar to Sub Auto_Open() as we generally see in macro files. It automatically runs the macros when the user clicks on Enable Content.
As we saw in Figure-3 on opening the file, we only see the enable content image. Since the state of sheets was hidden, we can right-click on the main sheet tab and we will see unhide option there, then we can select each sheet to unhide it. On hiding the sheet and change the font color to red we saw some random strings as seen in figure 6.
These hidden sheets contain malicious strings in an obfuscated manner. So, on analyzing more we observed that sheets inside the macrosheets folder contain these malicious strings.
Now as we can in figure-7 different tags are used in this XML sheet file. All the malicious strings are present in two tags <f> and <v> tags inside <sheetdata> tags. Now let’s look more in detail about these tags.
<v> (Cell Value) tags are used to store values inside the cell. <f> (Cell Formula) tags are used to store formulas inside the cell. Now in the above sheet <v> tags contain the cached formula value based on the last time formula was calculated. Formula cells contain formulas like “GOTO(Sheet2!H13)”, now as we can see here attackers can store different formulas while referencing cells from different sheets. These operations are done to produce more and more obfuscated sheets and evade AV signatures.
When the user clicks on the enable content button the execution starts from the Auto_Open cell, after which each sheet formula will start to execute one by one. The final deobfuscated string is shown below.
Here the URLDownloadToFIleA API is used to download the payload and the string “JJCCBB” is used to specify data types to call the API. There are multiple URI’s and from one of them, the DLL payload gets downloaded and saved as ..\\lertio.cersw. This DLL payload is then executed using rundll32. All these malicious activities get carried out using various excel based formulas like REGISTER, EXEC, etc.
Coverage and prevention guidance:
McAfee’s Endpoint products detect this variant of malware as below:
The main malicious document with SHA256 (91a1ba70132139c99efd73ca21c4721927a213bcd529c87e908a9fdd71570f1e) is detected as “Downloader-FCEI” with current DAT files.
Additionally, with the help of McAfee’s Expert rule feature, customers can add a custom behavior rule, specific to this infection pattern.
Rule {
Process {
Include OBJECT_NAME { -v “EXCEL.exe” }
}
Target {
Match PROCESS {
Include OBJECT_NAME { -v “rundll32.exe” }
Include PROCESS_CMD_LINE { -v “* ..\\*.*,DllRegisterServer” }
Include -access “CREATE”
}
}
}
McAfee advises all users to avoid opening any email attachments or clicking any links present in the mail without verifying the identity of the sender. Always disable the Macro execution for Office files. We advise everyone to read our blog on these types of malicious XLSM files and their obfuscation techniques to understand more about the threat.
Different techniques & tactics are used by the malware to propagate, and we mapped these with the MITRE ATT&CK platform.
- T1064(Scripting): Use of Excel 4.0 macros and different excel formulas to download the malicious payload.
- Defense Evasion (T1218.011): Execution of Signed binary to abuse Rundll32.exe and proxy executes the malicious code is observed in this Qakbot variant.
- Defense Evasion (T1562.001): Office file tries to convince a victim to disable security features by using a clean-looking image.
- Command and Control(T1071): Use of Application Layer Protocol HTTP to connect to the web and then downloads the malicious payload.
Conclusion
XLSM malware has been seen delivering many malware families. Many major families like Trickbot, Gozi, IcedID, Qakbot are using these XLSM macrosheets in high quantity to deliver their payloads. These attacks are still evolving and keep on using various obfuscated strings to exploit various windows utilities like rundll32, regsvr32, PowerShell, etc.
Due to security concerns, macros are disabled by default in Microsoft Office applications. We suggest it is only safe to enable them when the document received is from a trusted source and macros serve an expected purpose.